A Speeding K-means Clustering Method Based on Sampling

doi:10.3969/j.issn.1006-2475.2013.12.007

Abstract

Traditional K-means clustering does not scale to large datasets. To address this, this paper presents an accelerated K-means clustering method based on random sampling, called the Kmeans_RS clustering algorithm. A working set is drawn from the original dataset by random sampling, and traditional K-means is run on this working set. The center and radius of every resulting cluster are then computed, which gives the sampling result. The final clustering of the whole dataset is obtained by measuring the relationship between the sampling result and the remaining data and assigning the remaining data to clusters accordingly. Because random sampling reduces the size of the problem that K-means must solve, clustering efficiency improves substantially, and the method can be applied to large-scale clustering problems. Simulation results show that this parallel accelerated K-means method achieves excellent clustering efficiency.
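The steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not say how the radius is defined or how the remaining data are related to the sampled clusters, so the sketch assumes the radius is the largest distance from a center to its sampled members and assigns every point to its nearest center. The names `kmeans`, `kmeans_rs`, `sample_frac`, and `outside` are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=100, rng=None):
    """Plain Lloyd's K-means on X; returns (centers, labels)."""
    rng = np.random.default_rng(rng)
    # Initialise the centers with k distinct points from X.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Distance from every point to every center, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    # Recompute labels so they match the returned centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)


def kmeans_rs(X, k, sample_frac=0.1, seed=0):
    """Sampling-based K-means sketch (assumed details, see lead-in)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    m = max(k, int(sample_frac * n))

    # Step 1: draw the working set by random sampling without replacement.
    sample_idx = rng.choice(n, size=m, replace=False)
    work = X[sample_idx]

    # Step 2: run traditional K-means on the working set only.
    centers, work_labels = kmeans(work, k, rng=rng)

    # Step 3: per-cluster radius. ASSUMPTION: the maximum distance from
    # the center to the sampled members of that cluster.
    radii = np.zeros(k)
    for j in range(k):
        members = work[work_labels == j]
        if len(members) > 0:
            radii[j] = np.linalg.norm(members - centers[j], axis=1).max()

    # Step 4: label the full dataset. ASSUMPTION: nearest-center rule.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)

    # Flag points lying beyond the radius of their assigned cluster.
    outside = dists[np.arange(n), labels] > radii[labels]
    return labels, centers, radii, outside
```

In this sketch the iterative K-means step sees only `sample_frac` of the data, and the remaining points cost a single distance computation each, which is where the speed-up would come from. The `outside` mask marks points that fall beyond the radius of their nearest sampled cluster; how such points should be handled is not specified in the abstract.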


Key words: K-means clustering, random sampling, center, radius, working set, efficiency


WANG Xiu-hua. A Speeding K-means Clustering Method Based on Sampling[J]. Computer and Modernization, 2013, 12(12): 27-29.

